
Abstract. We propose to design data structures called succinct geometric indexes of negligible space
(more precisely, $o(n)$ bits) that, by taking advantage of the $n$ points in the data set being permuted
and stored elsewhere as a sequence, support geometric queries in optimal time. Our first and main
result is a succinct geometric index that can answer point location queries, a fundamental problem in
computational geometry, on planar triangulations in $O(\lg n)$ time (footnote 3). We also design three
variants of this index. The first supports point location using $\lg n + 2\sqrt{\lg n} + O(\lg^{1/4} n)$
point-line comparisons. The second supports point location in $o(\lg n)$ time when the coordinates are
integers bounded by $U$. The last variant can answer point location in $O(H + 1)$ expected time, where
$H$ is the entropy of the query distribution. These results match the query efficiency of previous point
location structures that use $O(n)$ words or $O(n \lg n)$ bits, while saving drastic amounts of space.
We then generalize our succinct geometric index to planar subdivisions, and design indexes for other
types of queries. Finally, we apply our techniques to design the first implicit data structures that support
point location in $O(\lg^2 n)$ time.
1 Introduction



The problem of efficiently storing and retrieving geometric data sets that typically consist of
collections of data points and regions is fundamental in computational geometry. Researchers
have designed many data structures to represent geometric data, and to support various types
of queries, such as point location [29, 20, 17, 35], nearest neighbour [27], range searching [1]
and ray shooting [25].

Among these queries, planar point location is perhaps the most fundamental and thus 
has been studied extensively. Given a planar subdivision, the problem is to construct a data 
structure so that the face of the subdivision containing a query point can be located quickly. 
In the 1980s, various researchers [29, 20, 17, 35] showed that data structures of $O(n)$ words,
where $n$ is the number of vertices of the planar subdivision, can be constructed to support
point location in $O(\lg n)$ time, which is asymptotically optimal.

Researchers have also considered improving the query efficiency of point location structures
under various assumptions. Several researchers [23, 36] considered the exact number
of steps (i.e. point-line comparisons) required to answer point location queries. Seidel and
Adamy [36] showed that there is an $O(n)$-word structure that can answer point location
in $\lg n + 2\sqrt{\lg n} + O(\lg^{1/4} n)$ steps. Researchers later considered the case where the query
distribution is known. If the probability of the $i$-th face of the planar subdivision containing
the query point is $p_i$, the lower bound of the expected time of answering a query under
the binary decision tree model is the entropy $H = \sum_{i=1}^{f} p_i \log_2 (1/p_i)$, where $f$ is the number
of faces. When the planar subdivision is a planar triangulation, data structures of $O(n)$
words can be constructed to answer point location queries in $O(H + 1)$ [27] expected time
or even using $H + O(\sqrt{H} + 1)$ [3] expected comparisons per query. Recently, Chan [13] and
Pătraşcu [33] considered the case where the coordinates of the points are integers bounded
by $U < 2^w$, and proposed a linear space structure that answers point location queries in
$O(\min\{\lg n / \lg\lg n, \sqrt{\lg U}\})$ time.

3 We use $\lg n$ to denote $\lceil \log_2 n \rceil$.

As we have already seen, much work has been done to improve the query efficiency of 
point location. However, much less effort has been made to further reduce the storage cost. 
As a result of the rapid growth of geometric data sets available in Geographic Information
Systems (GIS), spatial databases and graphics, many modern applications process geometric 
data measured in gigabytes or even terabytes. Although the above point location structures 
require linear space, the constants hidden in the asymptotic space bounds are usually large, 
so that they often occupy space many times the size of the geometric data. When the size of 
the data is huge, it is often impossible or at least undesirable to construct and store these 
data structures. Most data structures supporting other types of geometric queries are facing 
the same problem. 

Some attempts, however, have been made to improve the space efficiency of various
geometric data structures. Goodrich et al. [23] showed that given a planar triangulation, a
structure of sublinear space can be constructed to answer point location queries in $O(\lg n)$
time. However, their approach assumed that the connectivity information (i.e. adjacencies)
of the planar triangulation is given and stored elsewhere. This information can easily occupy
much more space than that required to store the point coordinates (an adjacency list for a
planar triangulation would take about $4n$ words), and can bring the total space of the point
location structure to $O(n)$ words. By applying the idea of implicit data structures [31],
researchers [8, 12] have designed some implicit geometric data structures. The idea is to
store a permuted sequence of the point set, so that with zero or $O(1)$ extra space, geometric
queries can be answered efficiently. The most recent result by Chan and Chen [12] showed
that an implicit structure can be constructed to answer nearest neighbour queries in the plane
in $O(\lg^{1.71} n)$ time. This approach saves a lot of space, but there are still limitations. First,
the above query time is not asymptotically optimal. Second, it is not known how to support
point location in planar triangulations or planar subdivisions using implicit data structures.

Have researchers tried all the major known techniques to design space-efficient geometric
data structures? The answer is no. There has been another line of research on data
structures called succinct data structures. Succinct data structures were first proposed by 
Jacobson [28] to encode bit vectors, (unlabeled) trees and planar graphs in space close to the 
information-theoretic lower bound, while supporting efficient navigational operations. This 
technique was successfully applied to various other abstract data types, such as dictionaries, 
strings, binary relations [5, 6] and labeled trees [22, 5, 6]. It has also been applied to data 
structures related to computational geometry. There are succinct representations of planar 
triangulations and planar graphs [15, 10, 11, 4] that use $O(n)$ bits, and support queries such
as testing the adjacency between two given vertices in constant time. However, they only 
encode the connectivity information, so they are succinct graph data structures rather than 
succinct geometric data structures. It is not known how to combine them with point location
structures without using $O(n)$ extra words or $O(n \lg n)$ bits.

In this paper, we propose to design succinct geometric data structures. Given a geometric 
data set, our goal is to store the coordinates of the points as a permuted sequence, and 
design an auxiliary data structure called succinct geometric index that occupies negligible 
space (more precisely, $o(n)$ bits) to support various geometric queries in optimal time. This
model resembles the permutation + bits model presented by Chan and Chen [12], but the two
differ. The latter was proposed only as an intermediate model for
the design of implicit geometric data structures, in which $O(n)$ bits are allowed in addition
to storing a permutation of the points. It also allows only bit probe operations on these bits,
while we have no such restriction.

1.1 Our Results 

We design succinct geometric indexes to answer point location queries. Our first and main
result is that, given a planar triangulation, we can permute its points to store the coordinates
in sequence, and construct a succinct geometric index that occupies $o(n)$ bits to support
point location in $O(\lg n)$ time. The preprocessing time is $O(n)$. Based on this, we design
three variants of this index. The first variant is a succinct geometric index that supports
point location in $\lg n + 2\sqrt{\lg n} + O(\lg^{1/4} n)$ steps, which matches the result of Seidel and
Adamy [36] while using negligible space. The preprocessing time is $O(n)$, which is an improvement
upon the $O(n \lg n)$ preprocessing time of the latter structure. The second variant is a succinct
geometric index that supports point location in $o(\lg n)$ time when the coordinates are integers
bounded by $U$. The last variant is a succinct geometric index that can answer point location
in $O(H + 1)$ expected time. These results match the query efficiency of previous point location
structures that use $O(n)$ words or $O(n \lg n)$ bits, while saving drastic amounts of space.

We then generalize our approach to the case of planar subdivisions (we assume the 
subdivision is within a polygon boundary), and design an o(n)-bit index that supports point 
location on planar subdivisions in O(lgn) time. This immediately yields another succinct 
geometric index that can test whether a query point is inside a given polygon. We also use 
our techniques for point location to design a succinct geometric index that supports vertical 
ray shooting. Finally, we apply our succinct geometric indexes to design the first implicit
data structures that support point location in $O(\lg^2 n)$ time. All our results are under the
word RAM model with $\Theta(\lg n)$-bit word size.

2 Preliminaries 

2.1 Bit Vectors 

A key structure for many succinct data structures, and for the research work in this paper, 
is a bit vector $B$ of length $n$ that supports rank and select operations. We assume that the
positions in $B$ are numbered $1, 2, \ldots, n$. For $\alpha \in \{0, 1\}$, the operator $\mathrm{rank}_B(\alpha, x)$ returns the
number of occurrences of $\alpha$ in $B[1..x]$, and the operator $\mathrm{select}_B(\alpha, r)$ returns the position
of the $r$-th occurrence of $\alpha$ in $B$. We omit the subscript $B$ when it is clear from the context.
Lemma 1 addresses the problem of succinctly representing bit vectors, in which part (a) is 
from Jacobson [28] and Clark and Munro [16], while part (b) is from Raman et al. [34]. 

Lemma 1. A bit vector $B$ of length $n$ with $v$ 1s can be represented using either: (a) $n + o(n)$
bits, or (b) $\lg \binom{n}{v} + O(n \lg\lg n / \lg n)$ bits, to support the access to each bit, rank and select
in $O(1)$ time.
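
To make the semantics of rank and select concrete, the following is a minimal Python sketch. It is not the succinct representation of Lemma 1: it uses linear extra space and serves only to illustrate what the two operators return. The class name `BitVector` and its layout are assumptions made for this illustration.

```python
class BitVector:
    """Naive 1-indexed bit vector with rank/select (illustration only, not succinct)."""

    def __init__(self, bits):
        self.bits = list(bits)
        # prefix[x] = number of 1s in B[1..x]
        self.prefix = [0]
        for b in self.bits:
            self.prefix.append(self.prefix[-1] + b)
        # positions[a] = increasing list of 1-indexed positions holding bit a
        self.positions = {0: [], 1: []}
        for pos, b in enumerate(self.bits, start=1):
            self.positions[b].append(pos)

    def rank(self, a, x):
        """Number of occurrences of bit a in B[1..x]."""
        ones = self.prefix[x]
        return ones if a == 1 else x - ones

    def select(self, a, r):
        """Position of the r-th occurrence of bit a in B."""
        return self.positions[a][r - 1]


B = BitVector([1, 0, 0, 1, 1, 0])
print(B.rank(1, 4), B.select(0, 3))
```

Both operations run in constant time here only because all answers are precomputed in $O(n)$ words; Lemma 1 achieves the same query time within the stated bit bounds.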

2.2 Graph Separators 

Graph separators [30] have been extensively studied as a way to partition graphs into subgraphs, to
allow divide and conquer. The variant of graph separators we use is called $t$-separators. Let
$G = (V, E)$ be a planar graph of $n$ vertices, where each vertex has a non-negative weight. A
$t$-separator ($0 < t < 1$) of $G$ is a subset, $S$, of $V$ whose removal from $G$ leaves no connected
component of total weight more than $t \cdot w(G)$, where $w(G)$ is the sum of the weights of the
vertices of $G$. Aleksandrov and Djidjev [2] have the following result:

Lemma 2 ([2]). Consider a planar graph $G$ with $n$ vertices, whose vertices have nonnegative
weights. For any $t$ such that $0 < t < 1$, there is a $t$-separator consisting of $O(\sqrt{n/t})$
vertices that can be computed in $O(n)$ time.

2.3 Encoding a Planar Triangulation by Permuting its Vertex Set 

Denny and Sohler [19] considered the problem of encoding the connectivity information of a 
planar triangulation by permuting its vertex set. They have the following result: 

Lemma 3 ([19]). Given a planar triangulation of $n$ vertices where $n > 1090$, there is an
algorithm that can encode it as a permutation of its point set in $O(n)$ time, such that it can
be decoded from this permutation in $O(n)$ time.

3 Point Location in Planar Triangulations 

In this section, we show how to design succinct geometric indexes to support point location 
queries on a planar triangulation $G$ of $n$ vertices, $m$ edges and $f$ internal faces. We define
a planar triangulation to be a planar subdivision in which each face (including the outer 
face) is a triangle. For simplicity, we use the term planar triangulation to refer to both the 
triangulation itself (with coordinates), and the embedded abstract planar graph underlying 
it. We start with our scheme of partitioning the triangulation. We then show how to label its 
vertices (i.e. how to permute its point set). We finally design data structures and algorithms 
to support point location queries. 
3.1 Partitioning a Planar Triangulation by Removing Faces

In this section, we present an approach to partition a planar triangulation by removing a set 
of internal faces. 

We first define the following terms. In a planar triangulation, two faces of the graph are 
adjacent if they have a common edge. A face path is a sequence of the faces of the graph such 
that each two consecutive faces in this sequence are adjacent. An adjacent face component is 
a set of internal faces of the graph in which there exists a face path between any two faces 
in this set, and there is no face in this set that is adjacent to an internal face not in this set. 
We define the size of an adjacent face component to be the number of internal faces in it. 
With these we can define the notion of graph separators consisting of faces: 

Definition 1. Consider a planar triangulation $G$ with $f$ internal faces. A $t$-face separator
($0 < t < 1$) of $G$ is a set of its internal faces of size $O(\sqrt{f/t})$ whose removal from $G$ leaves
no adjacent face component of more than $tf$ faces.

We have the following lemma: 

Lemma 4. Consider a planar triangulation $G$ of $f$ internal faces. For any $t$ such that
$0 < t < 1$, there is a $t$-face separator consisting of $O(\sqrt{f/t})$ faces that can be computed in $O(n)$
time.

Proof. We consider the graph $G^*$, which is the dual graph of $G$ excluding the vertex corresponding
to the outer face of $G$ and its incident edges. Then $G^*$ has $f$ vertices. As $G$ is a planar
triangulation, $G^*$ is a simple planar graph. By Lemma 2 (we simply let the weight of each
vertex be 1), there exists a $t$-separator, $S^*$, for $G^*$, consisting of $O(\sqrt{f/t})$ vertices. We use
$S$ to denote the set of faces of $G$ corresponding to the vertices in $S^*$. Then $S$ has $O(\sqrt{f/t})$
faces. As the removal of $S^*$ from $G^*$ leaves no connected component of more than $tf$ vertices,
the removal of $S$ from $G$ leaves no adjacent face component of more than $tf$ faces. Therefore,
$S$ is a $t$-face separator of $G$. □
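
The correspondence between connected components of the dual graph and adjacent face components can be illustrated with a short Python sketch. It assumes the internal faces are given as vertex triples and that a separator (a set of face indices) has already been chosen by some other means; computing the separator of Lemma 2 is not attempted here. The function name and input format are hypothetical.

```python
from collections import defaultdict


def adjacent_face_components(faces, separator):
    """Group the internal faces not in `separator` into adjacent face components.

    faces: list of internal triangles, each a triple of vertex ids.
    separator: set of indices into `faces` (the removed faces).
    """
    # Map each undirected edge to the internal faces incident to it.
    edge_to_faces = defaultdict(list)
    for idx, (a, b, c) in enumerate(faces):
        for u, v in ((a, b), (b, c), (c, a)):
            edge_to_faces[frozenset((u, v))].append(idx)

    # Dual graph restricted to internal faces: two faces are adjacent
    # iff they share an edge (the outer face is excluded by construction).
    dual = defaultdict(set)
    for incident in edge_to_faces.values():
        if len(incident) == 2:
            f, g = incident
            dual[f].add(g)
            dual[g].add(f)

    # Depth-first search over the dual graph, never entering separator faces.
    seen = set(separator)
    components = []
    for start in range(len(faces)):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            f = stack.pop()
            comp.append(f)
            for g in dual[f]:
                if g not in seen:
                    seen.add(g)
                    stack.append(g)
        components.append(sorted(comp))
    return components


# A toy strip of four triangles; removing face 1 splits it in two.
strip = [(0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 5)]
print(adjacent_face_components(strip, {1}))
```

The toy input is a strip rather than a full planar triangulation, which is enough to show how removing separator faces disconnects the remaining faces.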

Figure 1 shows a sample adjacent face component. We define the boundary of an adjacent 
face component to be a set of edges in which each edge is shared by an internal face of 
this component and a face of the t-face separator. Thus the boundary of an adjacent face 
component consists of one or more simple cycles: one simple cycle which is the outer face of 
the adjacent face component, and at most one simple cycle corresponding to each adjacent 
face component inside it. The simple cycle corresponding to the outer face does not share an
edge with any cycle inside; otherwise, the cycle inside is simply part of the outer face. A 
useful observation is that any two such simple cycles do not have a common edge, because 
otherwise, there are two faces of G sharing an edge that are in two different adjacent face 
components, which contradicts the definition of adjacent face components. 

As each adjacent face component obtained using Lemma 4 may have 1 to tn vertices, 
there may be as many as $O(n)$ of them. To further bound the number of subgraphs we
partition the graph into, we have the following lemma: 
Fig. 1. A typical adjacent face component, triangulated using dashed lines. The black triangles are separator triangles
that are adjacent to the triangles of this component. The other parts of the graph are not shown. 



Lemma 5. Consider a planar triangulation $G$ with $f$ internal faces and a $t$-face separator $S$
constructed using Lemma 4. The number of adjacent face components of $G \setminus S$ is $O(\sqrt{f/t})$.

Proof. As the number of adjacent face components is always less than the number of edges
in their boundaries, and no edge is in the boundaries of two different components, we
need only bound the number of such edges. Observe that each of these edges is either an edge
of a face in $S$ or one of the edges $\{v_0, v_1\}$, $\{v_1, v_{n-1}\}$, $\{v_{n-1}, v_0\}$, where $v_0$, $v_1$ and
$v_{n-1}$ are the three vertices on the outer face of $G$. Therefore, the number of such edges is
at most linear in the size of $S$, which is $O(\sqrt{f/t})$. □

A boundary vertex of an adjacent face component is a vertex on the boundary of the
adjacent face component, and an internal vertex is a vertex inside it. A vertex belongs to
an adjacent face component iff it is either a boundary vertex or an internal vertex of this
component. The duplication degree of each vertex is the number of adjacent face components
it belongs to. Thus the duplication degree of an internal vertex is 1. To bound the sum of
the duplication degrees of boundary vertices, we have the following lemma:

Lemma 6. Consider a planar triangulation $G$ with $f$ internal faces and a $t$-face separator
$S$ constructed using Lemma 4. The sum of the duplication degrees of all its boundary vertices
is $O(\sqrt{f/t})$.

Proof. Recall that the boundary of an adjacent face component consists of a set of simple
cycles. Thus the duplication degree of a boundary vertex is the number of simple cycles it
is in. Therefore, the sum of the duplication degrees of all the boundary vertices is equal to
the number of edges in all such simple cycles. As each such edge is an edge of a face of $S$,
and no edge exists in two different cycles, the number of such edges is at most three times
the number of faces of $S$, which is $O(\sqrt{f/t})$. □



3.2 The Two-Level Partitioning Scheme 

We now use Lemma 4 to partition the input graph $G$. Recall that $G$ has $n$ vertices and $f$
internal faces. Thus $f = 2n - 5$.

We first use Lemma 4 to partition $G$. We choose $t = (\lg^a f)/f$, where $a$ is a positive
constant parameter that we will fix later. Then the $t$-face separator, $S$, has $O(\sqrt{f/t}) =
O(f / \lg^{a/2} f)$ faces and thus $O(n / \lg^{a/2} n)$ vertices. We call each adjacent face component of
$G \setminus S$ a region. By Lemma 5, there are $r = O(\sqrt{f/t}) = O(f / \lg^{a/2} f)$ regions. Each region
has at most $tf = \lg^a f$ internal faces, and thus $O(\lg^a n)$ vertices. We use $R_i$ to denote the $i$-th
region of $G$ (the relative order of regions does not matter). We call this the top-level partition
of $G$.
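
For readers who want the arithmetic behind these bounds spelled out, substituting $t = (\lg^a f)/f$ gives the following short derivation (a worked restatement of the bounds above, not an additional result):

```latex
\[
\sqrt{\frac{f}{t}}
  = \sqrt{\frac{f}{(\lg^{a} f)/f}}
  = \sqrt{\frac{f^{2}}{\lg^{a} f}}
  = \frac{f}{\lg^{a/2} f},
\qquad
t f = \frac{\lg^{a} f}{f} \cdot f = \lg^{a} f .
\]
```

Since $f = 2n - 5$, both quantities can equally be written in terms of $n$, giving $O(n / \lg^{a/2} n)$ and $O(\lg^a n)$.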

We perform another level of partitioning. For each region $R_i$, we triangulate the graph
that consists of $R_i$ and the triangular outer face of $G$, and we denote the resulting planar
triangulation $R'_i$. Thus $R'_i$ has $O(\lg^a n)$ vertices. We partition $R'_i$ into smaller "regions" called
subregions so that each subregion has $O(\lg^b n)$ vertices, where $b$ is a positive constant parameter
smaller than $a$ that we will fix later. We use $R_{i,j}$ to denote the $j$-th subregion of $R_i$ (the
relative order of subregions in the same region does not matter). To do this, let $n_i$ be the
number of vertices in $R_i$. If $n_i > \lg^b n$ (otherwise, the entire region is also a subregion and
the separator has size 0), we choose $t_i = (\lg^b n)/n_i$ and use Lemma 4 to construct a $t_i$-face
separator, $S_i$, for each region $R_i$. Then $S_i$ has $O(\sqrt{n_i / (\lg^b n / n_i)}) = O(n_i / \lg^{b/2} n)$ vertices.
The sum of the numbers of vertices in all the $S_i$'s is $\sum_{i=1}^{r} O(n_i / \lg^{b/2} n) = O(n / \lg^{b/2} n)$. By
Lemma 5, there are $O(\sqrt{n_i / t_i}) = O(n_i / \lg^{b/2} n)$ subregions in $R_i$. Therefore, the total number
of subregions in $G$ is $\sum_{i=1}^{r} O(n_i / \lg^{b/2} n) = O(n / \lg^{b/2} n)$. We call this the bottom-level
partition of $G$.

3.3 The Labeling of the Vertices 

We now design a labeling scheme for the vertices based on the two-level partition in Section 3.2.
This labeling scheme assigns a distinct number from the set $[n]$ to each vertex $x$
of the graph (footnote 4). We call this number the graph-label of $x$. For each region $R_i$, this labeling
scheme also assigns a distinct number from the set $[n_i]$ to each vertex $x$ in $R_i$, where $n_i$ is
the number of vertices in this region. We call this number the region-label of $x$. For each
subregion $R_{i,j}$, a unique number from the set $[n_{i,j}]$ is assigned to each vertex $x$ in $R_{i,j}$, where
$n_{i,j}$ is the number of vertices in this subregion. We call this number the subregion-label of $x$.

Observe that, although each vertex $x$ has one and only one graph-label, it may have zero,
one or several region-labels or subregion-labels. This is because each vertex may belong to
more than one region or subregion, or only belong to the $t$-separator of $G$ or a $t_i$-separator
of a region $R_i$ of $G$.

We assign the labels from bottom up. We first assign the subregion-labels. Given subregion
$R_{i,j}$, we use Lemma 3 to permute its vertices (we have to surround $R_{i,j}$ using a triangle
and triangulate the resulting graph to use this lemma, but as the vertices of the added triangle
are always the last three vertices when permuted, this does not matter). If a vertex $x$
in $R_{i,j}$ is the $k$-th vertex in this permutation, then the subregion-label of $x$ in $R_{i,j}$ is $k$.

4 We use $[i]$ to denote the set $\{1, 2, \ldots, i\}$.

To assign a region-label to a vertex $x$ that belongs to region $R_i$, there are two cases.
First, we consider the case where $x$ belongs to at least one subregion in $R_i$. Let $h_i$ be the
number of vertices in $R_i$ that belong to at least one subregion of $R_i$. We assign a distinct
number from $[h_i]$ to each such vertex as its region-label in the following way: We visit each
subregion $R_{i,j}$, for $j = 1, 2, \ldots, u_i$, where $u_i$ is the number of subregions in $R_i$. When we
visit $R_{i,j}$, we check all its vertices sorted by their subregion-labels in increasing order. We
output a vertex of $R_{i,j}$ iff we have not checked this vertex before (i.e. it does not belong to
subregions $R_{i,1}, R_{i,2}, \ldots, R_{i,j-1}$). This way we output each vertex that belongs to at least one
subregion in $R_i$ exactly once, and we assign the number $k$ to the $k$-th vertex we output, and
this number is its region-label in $R_i$. Second, we consider the case where $x$ does not belong
to any subregion in $R_i$. There are $n_i - h_i$ such vertices. We assign a distinct number from
$\{h_i + 1, h_i + 2, \ldots, n_i\}$ to each of them in an arbitrary order, and the numbers assigned are
their region-labels.
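
The first-visit rule for region-labels can be sketched in a few lines of Python. The function name, the use of string vertex ids and the choice of input order for the "arbitrary order" of the second case are all assumptions made for illustration.

```python
def assign_region_labels(subregions, region_vertices):
    """Assign region-labels by order of first visit.

    subregions: list of vertex lists, in visiting order j = 1..u_i; each list
        is assumed to be sorted by subregion-label already.
    region_vertices: every vertex of the region R_i.
    Returns (labels, h) where labels maps vertex -> region-label and h is the
    number of vertices that belong to at least one subregion.
    """
    labels = {}
    # Case 1: vertices in some subregion get 1..h in order of first output.
    for sub in subregions:
        for v in sub:
            if v not in labels:
                labels[v] = len(labels) + 1
    h = len(labels)
    # Case 2: the remaining vertices get h+1..n_i; the order is arbitrary,
    # here simply the order in which they appear in region_vertices.
    for v in region_vertices:
        if v not in labels:
            labels[v] = len(labels) + 1
    return labels, h


subs = [["a", "b", "c"], ["c", "d", "b"]]
labels, h = assign_region_labels(subs, ["s", "a", "b", "c", "d"])
print(labels, h)
```

In this toy region, vertices `b` and `c` are shared by both subregions and are labeled only on their first visit, while `s` (a vertex in no subregion) receives the last label.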

We assign graph-labels to the vertices using an approach similar to the one in the previous 
paragraph. More precisely, we permute the h vertices that have region-labels in the order we 
first visit them when we check all the vertices by region. The k th vertex in such a permutation 
has graph-label $k$. We assign a distinct graph-label from $\{h + 1, h + 2, \ldots, n\}$ to the rest of
the vertices of the graph in an arbitrary order. 

We now show how to perform constant-time conversions between the graph-labels, region- 
labels and subregion-labels of the vertices of G. We have the following lemma: 

Lemma 7. There is a data structure of $o(n)$ bits such that given a vertex $x$ as a subregion-label
$k$ in subregion $R_{i,j}$, the region-label of $x$ in $R_i$ can be computed in $O(1)$ time. Similarly,
there is a data structure of $o(n)$ bits such that given a vertex $x$ as a region-label $k$ in region
$R_i$, the graph-label of $x$ can be computed in $O(1)$ time if $a > 2$.

Proof. We first show how to prove the first claim of this lemma. 

Recall that we use $u_i$ to denote the number of subregions in $R_i$ and $h_i$ to denote the
number of vertices in $R_i$ that have subregion-labels. We denote the number of regions of $G$
by $r$. We denote the number of vertices of subregions $R_{i,1}, R_{i,2}, \ldots, R_{i,u_i}$ by $n_{i,1}, n_{i,2}, \ldots, n_{i,u_i}$,
respectively. Let $n'_i$ be $\sum_{j=1}^{u_i} n_{i,j}$. As no internal
vertex of a subregion occurs in another subregion, in order to bound $n'_i$, we need only consider
the boundary vertices of the subregions in $R_i$. Then by Lemma 6, we have $n'_i = h_i +
O(\sqrt{n_i / t_i}) = h_i + O(n_i / \lg^{b/2} n) \le n_i + O(n_i / \lg^{b/2} n)$.

We consider a conceptual array, $A_i$, of length $n'_i$ for each region $R_i$. It is the concatenation
of the following conceptual arrays. For each subregion $R_{i,j}$, we construct a conceptual array
$A_{i,j}$ in which $A_{i,j}[k]$ stores the region-label of the vertex in $R_{i,j}$ whose subregion-label is $k$.
Then $A_i = A_{i,1} A_{i,2} \cdots A_{i,u_i}$. Clearly $A_i$ has the answers to our queries, but we do not store
it explicitly. Instead, we construct the following data structures for each region $R_i$:

- A bit vector $B_i[1..n'_i]$ which stores the numbers $n_{i,1}, n_{i,2}, \ldots, n_{i,u_i}$ in unary, i.e. $B_i =
  1 0^{n_{i,1}-1} 1 0^{n_{i,2}-1} \cdots 1 0^{n_{i,u_i}-1}$ (footnote 5);

- A bit vector $C_i[1..n'_i]$ in which $C_i[k] = 1$ iff the first occurrence of the region-label $A_i[k]$
  in $A_i$ is at position $k$ (let $q_i$ denote the number of 0s in $C_i$);

- An array $D_i[1..q_i]$ in which $D_i[k]$ stores the region-label of the vertex that corresponds
  to the $k$-th 0 in $C_i$, i.e. $D_i[k] = A_i[\mathrm{select}_{C_i}(0, k)]$.

To analyze the space cost of the above data structures constructed for the entire graph
$G$, we first show how to store all the $B_i$'s. The first step is to concatenate all the $B_i$'s and store
them as a single sequence $B$, and use part (b) of Lemma 1 to represent $B$. Let $y$ be the length
of $B$ and $z$ be the number of 1s in $B$. Then $B$ can be stored in $\lg \binom{y}{z} + O(y \lg\lg y / \lg y)$ bits.
As $n'_i \le n_i + O(n_i / \lg^{b/2} n)$, we have $y = \sum_{i=1}^{r} n'_i = n + O(n / \lg^{b/2} n)$. By Lemma 5, we have
$u_i = O(n_i / \lg^{b/2} n)$, so $z = \sum_{i=1}^{r} u_i = O(n / \lg^{b/2} n)$. As each subregion has $O(\lg^b n)$ vertices,
we also have $z = \sum_{i=1}^{r} u_i = \Omega(n / \lg^b n)$. By applying the inequality $\lg \binom{u}{v} \le v \lg \frac{eu}{v} +
O(1)$ [24, Section 4.6.4], we have $\lg \binom{y}{z} = \lg \binom{\sum_{i=1}^{r} n'_i}{z} = O(n \lg\lg n / \lg^{b/2} n)$. Thus the space cost
of $B$ is $O(n \lg\lg n / \lg^{b/2} n) + O(n \lg\lg n / \lg n) = o(n)$ bits. The rank/select operations on $B$
can be performed in constant time, thus in order to support the same operations on each
$B_i$ in constant time, it suffices to locate the starting position of any $B_i$ in $B$ in constant
time. This can be done by using another bit vector, $X$, of length $y$ to mark the starting
positions of all the $B_i$'s in $B$. Thus the length of $X$ is at most $n + O(n / \lg^{b/2} n)$ and it has
$r = O(n / \lg^{a/2} n)$ 1s, which can be stored using part (b) of Lemma 1 in $o(n)$ bits. The same
scheme can be used to concatenate and store all the $C_i$'s, and by a similar analysis, this
occupies $O(n \lg\lg n / \lg^{b/2} n) + O(n \lg\lg n / \lg n) = o(n)$ bits. The same bit vector $X$ allows us
to support rank/select on each $C_i$ in constant time. Finally, as $q_i$ is less than the sum of the
duplication degrees of all the vertices in $R_i$ under the bottom-level partition, by Lemma 6,
we have $q_i = O(\sqrt{n_i / t_i}) = O(n_i / \lg^{b/2} n)$. As each element of $D_i$ is within the range $[1..n_i]$,
it can be stored using $O(\lg\lg n)$ bits. Thus $D_i$ occupies $O(n_i \lg\lg n / \lg^{b/2} n)$ bits, so all the
$D_i$'s occupy $O(n \lg\lg n / \lg^{b/2} n)$ bits. They can be concatenated and stored using the same
scheme with $o(n)$ additional bits.
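
The bound on the binomial term can be checked step by step. The sketch below uses the standard estimate $\binom{y}{z} \le (ey/z)^z$, which may differ in form from the inequality cited from [24], together with the bounds on $y$ and $z$ stated above:

```latex
\[
\lg \binom{y}{z} \;\le\; z \lg \frac{e\,y}{z},
\qquad
\frac{y}{z} = \frac{O(n)}{\Omega(n / \lg^{b} n)} = O(\lg^{b} n)
\;\Longrightarrow\;
\lg \frac{e\,y}{z} = O(\lg\lg n),
\]
\[
\lg \binom{y}{z}
  = O\!\left(\frac{n}{\lg^{b/2} n}\right) \cdot O(\lg\lg n)
  = O\!\left(\frac{n \lg\lg n}{\lg^{b/2} n}\right) = o(n).
\]
```

The upper bound on $z$ controls the leading factor, while the lower bound on $z$ is what keeps the logarithmic factor at $O(\lg\lg n)$.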

We now show how to compute, given a vertex $x$ with subregion-label $k$ in subregion
$R_{i,j}$, the region-label of $x$ in $R_i$. We first locate the position, $l$, in $A_i$ that corresponds to
the occurrence of vertex $x$ in subregion $R_{i,j}$. As the vertex with subregion-label 1 in $R_{i,j}$
corresponds to position $\mathrm{select}_{B_i}(1, j)$ in $A_i$, we have $l = \mathrm{select}_{B_i}(1, j) + k - 1$. We then
retrieve $C_i[l]$. If it is 1, then the first occurrence of the region-label of $x$ in $A_i$ is at position
$l$, thus $\mathrm{rank}_{C_i}(1, l)$ is the result by the labeling scheme. Otherwise, the result is explicitly
stored in $D_i[\mathrm{rank}_{C_i}(0, l)]$. The above operations clearly take constant time.
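
The conversion from a subregion-label to a region-label can be traced on a small example. The Python sketch below builds $B_i$, $C_i$ and $D_i$ for one region as plain lists and answers a query with linear-time rank and select; it illustrates the logic only and does not reproduce the constant-time, $o(n)$-bit encoding. All function names are hypothetical, and every subregion is assumed to be non-empty.

```python
def build_index(subregions, region_label):
    """Build B, C, D for one region from its subregions.

    subregions: list of vertex lists, each sorted by subregion-label.
    region_label: dict vertex -> region-label, assigned by first visit.
    """
    # Conceptual array A: region-labels in subregion order (not stored in the paper).
    A = [region_label[v] for sub in subregions for v in sub]
    # B marks the start of each subregion: a 1 followed by (size - 1) zeros.
    B = []
    for sub in subregions:
        B.extend([1] + [0] * (len(sub) - 1))
    # C[k] = 1 iff position k holds the first occurrence of its label in A;
    # D stores the labels at the 0-positions of C, in order.
    seen, C, D = set(), [], []
    for lab in A:
        if lab in seen:
            C.append(0)
            D.append(lab)
        else:
            seen.add(lab)
            C.append(1)
    return B, C, D


def rank(bits, a, x):
    """Occurrences of bit a in bits[1..x] (1-indexed, linear time)."""
    return sum(1 for b in bits[:x] if b == a)


def select(bits, a, r):
    """1-indexed position of the r-th occurrence of bit a (linear time)."""
    count = 0
    for pos, b in enumerate(bits, start=1):
        if b == a:
            count += 1
            if count == r:
                return pos
    raise ValueError("fewer than r occurrences")


def subregion_to_region_label(B, C, D, j, k):
    """Region-label of the vertex with subregion-label k in subregion j."""
    l = select(B, 1, j) + k - 1
    if C[l - 1] == 1:
        return rank(C, 1, l)
    return D[rank(C, 0, l) - 1]


subs = [["a", "b", "c"], ["c", "d", "b"]]
labels = {"a": 1, "b": 2, "c": 3, "d": 4}
B, C, D = build_index(subs, labels)
print(subregion_to_region_label(B, C, D, 2, 3))
```

The first branch relies on region-labels having been assigned in order of first visit, so the number of 1s in $C_i$ up to position $l$ is exactly the label; only repeated occurrences need the explicit array $D_i$.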

The second claim of the lemma can be proved similarly, and the space required is
$O(n / \lg^{a/2-1} n) + o(n)$ bits, which is $o(n)$ bits when $a > 2$. □

5 We use $0^l$ to denote a bit sequence of $l$ 0s.
3.4 Answering Point Location Queries

Lemma 8. Given a planar triangulation $G$ of $n$ vertices, there is a succinct geometric index
of $o(n)$ bits that supports point location on $G$ in $O(\lg n)$ time.

Proof. We perform the two-level partitioning of $G$ as in Section 3.2, and use the approach
in Section 3.3 to assign labels to the vertices of $G$, but we do not store these labels. Instead,
we sort the vertices by their graph-labels in increasing order, and store their coordinates as
a sequence. We then show how to construct a succinct geometric index of $o(n)$ bits.

The succinct geometric index consists of three sets of data structures. The first set of
data structures are the data structures constructed in Lemma 7 that support conversions
between subregion-labels, region-labels and graph-labels. By the proof of Lemma 7, they
occupy $O(n / \lg^{a/2-1} n) + o(n)$ bits. The second and the third sets of data structures correspond
to the top-level and the bottom-level partitions.

To design the data structures for the top-level partition, we consider the graph $S'$ constructed
by triangulating the graph consisting of the separator $S$ and the outer face of $G$. $S'$ is
a planar triangulation of $O(n / \lg^a n)$ vertices, so we can use the approach of Kirkpatrick [29]
(in fact, any structure that uses $O(n \lg n)$ bits for an $n$-vertex planar triangulation to answer
point location in $O(\lg n)$ time can be used here) to construct a data structure $P$ of $O(n / \lg^a n)$
words (i.e. $O(n / \lg^{a-1} n)$ bits) to support the point location queries on $S'$. Note that when
we construct $P$, we simply use the graph-label of any vertex to refer to its coordinates, so
that we do not store any coordinate in $P$. For each face of $S'$, we store an integer. If this face
is in region $R_i$, we store $i$. We store 0 if it is a face in separator $S$. As there are $O(n / \lg^a n)$
faces and the number assigned to each face can be stored in $\lg n$ bits, the space required to
store these numbers is $O(n / \lg^{a-1} n)$ bits. All these ($P$ and the numbers assigned to the faces
of $S'$) are the data structures for the top-level partition of $G$, and they occupy $O(n / \lg^{a-1} n)$
bits.

The data structures for the bottom-level partition are constructed over the regions of
$G$. Given a region $R_i$, recall that we construct a planar triangulation $R'_i$ when we perform
the bottom-level partition. Consider the graph $S'_i$ constructed by triangulating the graph
consisting of the separator $S_i$ (recall that it is a separator for $R'_i$) and the outer face of $R'_i$. We
observe that $S'_i$ is a planar triangulation of $O(n_i / \lg^{b/2} n)$ vertices. As the number of vertices
of $S'_i$ is $O(\lg^a n)$, a pointer that refers to a vertex of $S'_i$ can be stored in $O(\lg\lg n)$ bits. To
refer to the coordinates of any vertex of $S'_i$, we only use its region-label as we can compute
its graph-label in constant time by Lemma 7, and $O(\lg\lg n)$ bits are sufficient to store a
region-label. Thus we can use the approach of Kirkpatrick [29] to construct a data structure
$P_i$ of $O(n_i \lg\lg n / \lg^{b/2} n)$ bits to support the point location queries on $S'_i$ in $O(\lg\lg n)$ time.
We also store a number for each face of $S'_i$, and this number is $j$ if this face is in subregion
$R_{i,j}$, and 0 if it is a face in separator $S_i$. As there are $O(n_i / \lg^{b/2} n) = O(\lg^{a-b/2} n)$ subregions
in $R'_i$, each number occupies $O(\lg\lg n)$ bits, so the space cost of storing these numbers
is $O(n_i \lg\lg n / \lg^{b/2} n)$ bits. The space cost of these data structures for all the regions is
$\sum_{i=1}^{r} O(n_i \lg\lg n / \lg^{b/2} n) = O(n \lg\lg n / \lg^{b/2} n) = o(n)$ bits.

Therefore, the succinct geometric index constructed above occupies $O(n / \lg^{a/2-1} n) + o(n)$
bits.
We now show how to answer point location queries using this succinct geometric index.
Given a query point $x$, we first locate the face of $S'$ that contains $x$ using the set of data
structures constructed for the top-level partition in $O(\lg n)$ time. We retrieve the integer, $i$,
assigned to the face of $S'$ that $x$ is in. If $i$ is 0, then this face is in $S$, and we return its three
vertices as the result. Otherwise, $x$ is inside region $R_i$. We then use the bottom-level
partition to perform a point location query on the graph $S'_i$ in $O(\lg\lg n)$ time using $x$ as the
query point. We retrieve the integer, $j$, assigned to the face of $S'_i$ that $x$ is in. If $j$ is 0, then
this face is in $S_i$, and we return its three vertices as the result (we need to convert the
region-labels of these three vertices to their graph-labels when we return them). Otherwise, $x$
is inside subregion $R_{i,j}$. Using the bit vector $B_i$ constructed in the proof of Lemma 7, we can
compute the number of vertices of $R_{i,j}$ in constant time. Recall that $r_{i,j}$ denotes this number.
By Lemma 7, we can compute the graph-label of any of these vertices in constant time, and
thus retrieve its coordinates in $O(1)$ time. As we number these vertices using Lemma 3, we
can use Lemma 3 to construct the graph $R'_{i,j}$ in $O(r_{i,j})$ time, and then check each of its faces
to find the face that $x$ is in. This takes $O(r_{i,j}) = O(\lg^b n)$ time. Therefore, the entire
process takes $O(\lg n + \lg^b n)$ time.
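The last step of the query, checking each face of the decoded subregion, can be sketched as a linear scan with orientation tests. This is a minimal illustration only: the names `orient`, `in_triangle` and `scan_subregion` are hypothetical, and the coordinate list and face list are assumed to have been decoded already (the decoding via Lemma 3 is not shown).

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]
Triangle = Tuple[int, int, int]  # indices into a coordinate list (hypothetical layout)


def orient(a: Point, b: Point, c: Point) -> float:
    """Twice the signed area of triangle abc (positive if counter-clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_triangle(x: Point, a: Point, b: Point, c: Point) -> bool:
    """True if x lies inside or on the boundary of triangle abc (either orientation)."""
    d1, d2, d3 = orient(a, b, x), orient(b, c, x), orient(c, a, x)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_neg and has_pos)


def scan_subregion(
    x: Point, coords: Sequence[Point], faces: Sequence[Triangle]
) -> Optional[int]:
    """Return the index of the first face containing x, or None if no face does.

    Uses three point-line comparisons per face, so the scan is linear in the
    number of faces of the subregion.
    """
    for k, (i, j, l) in enumerate(faces):
        if in_triangle(x, coords[i], coords[j], coords[l]):
            return k
    return None
```

Because a subregion has only $O(\lg^b n)$ faces, this brute-force scan fits within the stated query time.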



succinct geometric index of $o(n)$ bits that supports point location queries in $O(\lg n)$ time,



Lemma 9. The data structures of Lemma 8 can be constructed in $O(n)$ time.

Proof. We first show how to compute the order of the vertices in $O(n)$ time. To prove this,
we need to show that the two-level partition in Section 3.2 and the labeling of the vertices in
Section 3.3 can be performed in $O(n)$ time.

To show that the two-level partition can be performed in $O(n)$ time, we first observe that
the computation of $t$-face separators at both levels can be performed in time linear in the
numbers of vertices of the graphs. Thus such computation can be performed in $O(n)$ time.
The only part that is not clear is the time required to construct the graphs $R'_i$. Recall that
we construct $R'_i$ by triangulating the graph that consists of $R_i$ and the outer face of $G$, and
that the boundary of $R_i$ consists of one or more simple cycles: one simple cycle which is the
outer face of $R_i$, and at most one simple cycle corresponding to each adjacent face component
inside it. Thus we need only triangulate the interior of each simple cycle corresponding to
each adjacent face component inside $R_i$, and the face, $F$, defined by the simple cycle which
is the outer face of $R_i$ and the boundary of the outer face of $G$. To triangulate the interior
of each simple cycle, we use the linear-time algorithm by Chazelle [14]. To triangulate $F$, we
start at an arbitrary vertex $y$ on the outer face of $R_i$. We locate a vertex of the triangular
outer face of $G$ such that the line segment between this vertex and $y$ does not cross any edge
of the outer face of $R_i$, and we draw an edge between $y$ and this vertex. This divides $F$ into
two simple polygons, which can be triangulated in linear time. Thus we can construct the
graph $R'_i$ in time linear in the number of its vertices after we perform the top-level partition.
Therefore, it takes $O(n)$ time to construct all the $R'_i$.




which completes the proof. 



□ 
The linear-time construction of the succinct geometric index in Lemma 8 directly follows
from the linear-time construction of Kirkpatrick's point location structure [29] and the data
structure for part (b) of Lemma 1. □



Combining Lemmas 8 and 9, we have our first and main result: 

Theorem 1. Given a planar triangulation $G$ of $n$ vertices, there is a succinct geometric
index of $o(n)$ bits that supports point location on $G$ in $O(\lg n)$ time. This index can be
constructed in $O(n)$ time.

We now design three variants of this succinct geometric index to address the query
efficiency under different assumptions. We first consider the exact number of point-line
comparisons required to answer a query.

Corollary 1. Given a planar triangulation $G$ of $n$ vertices, there is a succinct geometric
index of $o(n)$ bits that supports point location on $G$ using at most $\lg n + 2\sqrt{\lg n} + O(\lg^{1/4} n)$
steps. This index can be constructed in $O(n)$ time.

Proof. We use the same approach as that for Theorem 1, but we choose $b = 1/4$. When we
construct the data structures for the top-level partition, we use the approach by Seidel and
Adamy [36] to construct the point location structure $P$. This way point location on $S'$ can
be computed in at most $\lg n + 2\sqrt{\lg n} + O(\lg^{1/4} n)$ steps. As point location on $S'_i$ and
$R'_{i,j}$ can be supported in $O(\lg\lg n)$ and $O(\lg^b n) = O(\lg^{1/4} n)$ steps, respectively, the overall
number of steps required to answer point location queries on $G$ is at most
$\lg n + 2\sqrt{\lg n} + O(\lg^{1/4} n)$.

The other claims in this corollary are easy to prove. Note that when analyzing the preprocessing
time, the super-linear preprocessing time of the approach by Seidel and Adamy [36]
is not a problem, as we apply it to the graph $S'$, which has $O(n/\lg^3 n)$ vertices. Thus $P$ can
be constructed in $O((n/\lg^3 n) \times \lg(n/\lg^3 n)) = O(n/\lg^2 n)$ time. □
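The last bound can be spelled out as follows. This is a short worked step, assuming (as the comparison with [36] below indicates) that the structure of Seidel and Adamy is built in $O(m \lg m)$ time on a graph of $m$ vertices, applied here with $m = O(n/\lg^3 n)$:

```latex
\[
\frac{n}{\lg^{3} n}\cdot \lg\!\left(\frac{n}{\lg^{3} n}\right)
\;\le\; \frac{n}{\lg^{3} n}\cdot \lg n
\;=\; \frac{n}{\lg^{2} n}
\;=\; o(n).
\]
```

The super-linear factor is therefore absorbed by the $\lg^3 n$ reduction in the size of the graph on which $P$ is built.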

Note that Corollary 1 not only matches the best result [36] in terms of the exact number
of point-line comparisons using negligible space, but also improves the preprocessing time
from $O(n \lg n)$ to $O(n)$.

If all the coordinates are integers bounded by $U < 2^w$, we have the following variant:

Corollary 2. Assume that all the point coordinates in the plane are integers bounded by
$U < 2^w$. Given a planar triangulation $G$ of $n$ vertices, there is a succinct geometric index of
$o(n)$ bits that supports point location on $G$ in $O(\min\{\lg n/\lg\lg n, \sqrt{\lg U}\} + \lg^{\epsilon} n)$ time, for
any constant $\epsilon > 0$. This index can be constructed in $O(n)$ time.

Proof. We use the same approach as that for Theorem 1, but we choose $b = \epsilon$. When we
construct the data structures for the top-level partition, we use the approach by Chan [13]
and Pătraşcu [33] to construct the point location structure $P$. This way point location on
$S'$ can be computed in $O(\min\{\lg n/\lg\lg n, \sqrt{\lg U}\})$ time. As point location on $S'_i$ and $R'_{i,j}$
can be supported in $O(\lg\lg n)$ time and $O(\lg^b n) = O(\lg^{\epsilon} n)$ time, respectively, the overall
time required to answer point location queries on $G$ is $O(\min\{\lg n/\lg\lg n, \sqrt{\lg U}\} + \lg^{\epsilon} n)$.

These data structures can still be constructed in $O(n)$ time, as the structure by Chan [13]
and Pătraşcu [33] can be constructed in linear time. □
We then consider the case where the query distribution is known.

Corollary 3. Given a planar triangulation $G$ of $n$ vertices, there is a succinct geometric
index of $o(n)$ bits that supports point location on $G$ in $O(H + 1)$ expected time. This index
can be constructed in $O(n)$ time.

Proof. If the probability of a face or a set of faces containing a query point is p, we say that 
this face or this set of faces has probability p. In this proof, we define the t-face separator 
to be the set of faces of G whose removal partitions G into adjacent face components each 
of which has probability at most t. We consider graph G*, which is the dual graph of G 
excluding the vertex corresponding to the outer face of G and its incident edges, and assign 
the probability of each face of G to its corresponding vertex in G*. By Lemma 2, the following 
lemma is immediate: 

Lemma 10. Consider a planar triangulation $G$ of $f$ internal faces, with a probability associated
with each face. For any $t$ such that $0 < t < 1$, there is a $t$-face separator consisting of
$O(\sqrt{f/t})$ faces that can be computed in $O(n)$ time.

Observe that Lemmas 5 and 6 also apply to t-face separators for graphs whose faces are 
associated with probabilities. This is because we prove these two lemmas by bounding the 
number of edges in the separator, which has nothing to do with probabilities. 

We choose $t = \lg^3 f/f$ to apply Lemma 10 to $G$. Let $S''$ be the $t$-face separator. Then
$S''$ has $O(n/\lg^{3/2} n)$ vertices. We call each adjacent face component of $G \setminus S''$ a super region.
Thus the number of super regions is $O(n/\lg^{3/2} n)$, and the sum of the duplication
degrees of the boundary vertices of all the super regions is also $O(n/\lg^{3/2} n)$. Note that we
can no longer prove that each super region has o(n) vertices; this is because a super region 
can have a large number of faces with very low probabilities. Thus we further perform a 
two-level partition on each super region as in the proof of Lemma 8. Therefore, we actually 
perform a three-level partition on G, and we call them first-level, second-level and third-level 
partitions from top down. It is straightforward to extend the techniques in Section 3.2 to this 
case to compute the permuted sequence of the vertices, and to perform conversions between 
the labels assigned to the same vertices at different levels of the partition. 

For the first-level partition, we construct a triangulated graph $G''$ by triangulating the
graph consisting of $S''$ and the outer face of $G$. To assign a probability to each face of
$G''$, initially we let the probability of each face of $S''$ be the same as its probability in
$G$, and let the probability of any other internal face be $1/n$. However, the sum of all
the probabilities of the faces of $G''$ can be larger than 1, though it is at most 2. We thus
reduce the probability of each face of $G''$ by a constant ratio, so that the sum becomes
1. It is clear that the above process reduces the probability of each face by at most half.
Therefore, the probability of each face of $S''$ in $G''$ is at least half of that in $G$, and the
probability of each internal face of $G''$ that is not in $S''$ is at least $1/(2n)$. We construct a
point location structure, $P''$, for $G''$ using the approach of Iacono [26], or any linear-space
structure that answers point location in $O(\lg(1/p))$ time if the query point is contained in a
face of probability $p$. $P''$ occupies $O(n/\lg^{1/2} n)$ bits. We also store additional information for
each face of $G''$ to indicate whether it is a face in $S''$ and, if not, which super region it is in.
For the second-level and third-level structures, we construct data structures similar to those
constructed in Lemma 8. The algorithm to answer point location queries is similar, except
that we now perform operations at three levels of partition.
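The probability assignment for the faces of $G''$ can be illustrated with exact rational arithmetic. This is a minimal sketch under stated assumptions: the function name `rescale_face_probabilities` and its argument layout are hypothetical, and it assumes that the separator probabilities sum to at most 1 and that there are at most $n$ other internal faces, so that the raw total is at most 2.

```python
from fractions import Fraction
from typing import List, Sequence, Tuple


def rescale_face_probabilities(
    separator_probs: Sequence[Fraction], num_other_faces: int, n: int
) -> Tuple[List[Fraction], Fraction]:
    """Assign probabilities to the faces of G'' and normalize them to sum to 1.

    Faces of the separator keep their probability from G; every other internal
    face starts at 1/n. All values are then multiplied by the same ratio.
    If the raw total is at most 2, the ratio is at least 1/2, so no face loses
    more than half of its probability. (If the raw total is below 1, this
    sketch scales values up rather than down.)
    """
    raw = list(separator_probs) + [Fraction(1, n)] * num_other_faces
    total = sum(raw, Fraction(0))
    scale = 1 / total
    return [p * scale for p in raw], scale
```

For example, with separator probabilities $3/10$ and $1/5$, six other faces and $n = 10$, the raw total is $11/10$ and every probability is multiplied by $10/11$.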

To analyze the query time, it is sufficient to show that, if the face, $z$, of $G$ that contains
the query point $x$ has probability $p$, the query can be answered in deterministic
time $O(\min\{\lg n, \lg(1/p)\})$. There are two cases. First, $z$ is a face in $S''$. In this case,
we need only use $P''$ to retrieve the result. By Iacono's result [26], the time required
is $O(\min\{\lg n, \lg(1/p')\})$, where $p'$ is the probability of $z$ in $G''$. By the analysis in the
above paragraph, $p' \ge p/2$. Thus the claim is true in this case. Second, $z$ is not a face
in $S''$. In this case, the query is answered in $O(\lg n)$ time. Thus it suffices to prove that
$O(\min\{\lg n, \lg(1/p)\}) = O(\lg n)$, that is, that $\lg(1/p) = \Omega(\lg n)$. Recall that each super
region has probability at most $t = \lg^3 n/n$. As $z$ is part of a super region, we have
$p \le \lg^3 n/n$. Thus $\lg(1/p) \ge \lg n - 3\lg\lg n$, and the claim follows.
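The step from the per-face bound to the expected query time can be written out as follows. Here $p_z$ denotes the probability of face $z$, and $H$ is taken to be the Shannon entropy of the query distribution; this notation is an assumption made for the illustration, and the additive 1 accounts for the constant overhead of each query.

```latex
\[
\mathbb{E}[\text{query time}]
= \sum_{z} p_z \cdot O\bigl(\min\{\lg n,\ \lg(1/p_z)\} + 1\bigr)
\;\le\; O\Bigl(\sum_{z} p_z \lg(1/p_z)\Bigr) + O(1)
= O(H + 1),
\qquad
H = \sum_{z} p_z \lg(1/p_z).
\]
```

In the second case of the analysis, $\lg(1/p) \ge \lg n - 3\lg\lg n \ge \tfrac{1}{2}\lg n$ for sufficiently large $n$, which is why an $O(\lg n)$-time query is still within $O(\lg(1/p))$.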

It is straightforward to show that the space cost is $o(n)$ bits and that the preprocessing
time is $O(n)$. □

4 Point Location in Planar Subdivisions 

We now generalize the techniques in Section 3 to design succinct geometric indexes supporting
point location queries in general planar subdivisions. In this section, we adopt the
assumption that a planar subdivision is contained inside a bounding simple polygon (i.e. it
does not have any infinite faces), and each face is also a simple polygon.

4.1 Encoding a Planar Subdivision by Permuting Its Point Set 

We now generalize Lemma 3 to the case of planar subdivisions. 

Lemma 11. Given a planar subdivision of $n$ vertices for sufficiently large $n$, there exists an
algorithm that can encode it as a permutation of its point set in $O(n)$ time, such that the
subdivision can be decoded from this permutation in $O(n)$ time.

Proof. To encode the given planar subdivision $G$, we first surround it using a bounding
triangle, triangulate it, and denote the resulting graph by $T$. This process takes $O(n)$ time
using the approach in the proof of Lemma 9. Lemma 3 is sufficient to encode $T$, but we need
to encode more information in order to decode $G$, as we add more edges when triangulating it.
We show how to encode and decode such information in the rest of this proof.

To use Lemma 3 to encode $T$, we first compute a maximal independent set, $I$, of vertices
of $T$ with degree at most 6. The size of $I$ is at least $n/10$ as shown by Denny and Sohler [19].
We remove the vertices in $I$ from $T$ and re-triangulate $T$. We then visit the new triangulation
in a canonical way such as BFS, and order the vertices in $I$ by the order of the triangles that
contain them. We divide the vertices in $I$ into sets of the same constant size, and permute
each set to encode enough information so that given the new triangulation and a vertex in
$I$ (it is sufficient to visit all the faces of the new triangulation to determine each face that
contains a vertex in $I$), we know how to insert it into the new triangulation to reconstruct
$T$. As each vertex in $I$ has degree at most 6, its removal from $T$ creates a polygon of size at
most 6. Based on this, Denny and Sohler [19] proved that there are at most 41 possibilities of
inserting a vertex. Thus each set of $I$ only has to be large enough so that the permutations
of its subsets are sufficient to encode $\lg 41$ bits for each point in $I$. We continue this process
until all the vertices except the vertices in the outer triangular face are removed.
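The idea of storing information in the order of a point set can be illustrated with the factorial number system (Lehmer code): a set of $k$ distinct points has $k!$ orderings, so one ordering can represent any integer in $[0, k!)$. The sketch below is a generic illustration of that principle, not the specific encoding of Denny and Sohler [19]; the names `encode` and `decode` are hypothetical, and points are compared lexicographically by coordinates.

```python
from math import factorial
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def encode(points: Sequence[Point], value: int) -> List[Point]:
    """Return an ordering of the distinct points that represents `value`.

    `value` must lie in [0, k!) where k = len(points). Digits of `value` in the
    factorial number system select which remaining point comes next.
    """
    pool = sorted(points)
    k = len(pool)
    if not 0 <= value < factorial(k):
        raise ValueError("value does not fit in a permutation of this size")
    order = []
    for remaining in range(k, 0, -1):
        idx, value = divmod(value, factorial(remaining - 1))
        order.append(pool.pop(idx))
    return order


def decode(order: Sequence[Point]) -> int:
    """Recover the integer represented by an ordering produced by `encode`."""
    pool = sorted(order)
    value = 0
    for remaining, point in zip(range(len(order), 0, -1), order):
        idx = pool.index(point)
        value += idx * factorial(remaining - 1)
        pool.pop(idx)
    return value
```

No extra bits are stored: the decoder sorts the points it receives to recover the canonical order and reads the integer off the permutation. A set of constant size $k$ thus carries $\lfloor \lg k! \rfloor$ bits.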

To modify the above process to encode enough information to indicate which edges of
$T$ are present in $G$, we observe that, to decode $T$, when we insert a vertex into the current
triangulation, we determine its neighbours in the previous version of this triangulation,
remove the edges in the polygon defined by these neighbours, and draw an edge between
this vertex and each of its neighbours. Therefore, we draw at most 6 edges when we insert
a vertex. To encode whether each of these edges is an edge in $T \setminus G$, we need 6 bits. Thus
we make each subset of $I$ large enough to encode $\lg 41 + 6$ bits of information for each
vertex in it. When we decode $T$, each time we insert a vertex into the triangulation and draw
an edge between it and one of its neighbours, we use the encoded information and store a
flag for each edge to indicate whether it is an edge in $T \setminus G$. Once we decode $T$, we visit all
its edges to remove those in $T \setminus G$ to get $G$.

The processes of encoding and decoding clearly take $O(n)$ time. □

4.2 Partitioning a Planar Subdivision by Removing Faces 

We first generalize the $t$-face separators defined for planar triangulations to planar
subdivisions. Note that the definition of adjacent face component in Section 3.1 can be directly
applied to planar subdivisions. We have the following definition:

Definition 2. Consider a planar subdivision $G$ with $f$ internal faces. A $t$-face separator
of $G$ is a set of its internal faces of size $O(\sqrt{f/t})$ whose removal from $G$ leaves no adjacent
face component of more than $tf$ faces.

Observe that in the proof of Lemma 4, we do not make use of the fact that each face of 
a planar triangulation is a triangle. Thus we have the following lemma: 

Lemma 12. Consider a planar subdivision $G$ with $f$ internal faces. For any $t$ such that
$0 < t < 1$, there is a $t$-face separator consisting of $O(\sqrt{f/t})$ faces that can be computed in
$O(n)$ time.

We also define the notions of boundary, boundary vertex, internal vertex and duplication
degree on planar subdivisions as in Section 3.1. As in the case of planar triangulations, we
can bound the number of adjacent face components and the sum of the duplication degrees
of boundary vertices after removing a $t$-face separator from a planar subdivision. The only
difference is that when we count the number of edges in the $t$-face separator, we can no longer
use the fact that each face has 3 edges. Instead, we make use of the maximum number of
vertices of any internal face of the planar subdivision to bound these two values. Therefore,
we have the following two lemmas:
Lemma 13. Consider a planar subdivision $G$ with $f$ internal faces and a $t$-face separator $S$
constructed using Lemma 12. The number of adjacent face components of $G \setminus S$ is $O(k\sqrt{f/t})$,
where $k$ is the maximum number of vertices of any internal face of $G$.

Lemma 14. Consider a planar subdivision $G$ with $f$ internal faces and a $t$-face separator $S$
constructed using Lemma 12. The sum of the duplication degrees of all its boundary vertices
is $O(k\sqrt{f/t})$, where $k$ is the maximum number of vertices of any internal face of $G$.

4.3 The Two-Level Partitioning Scheme 

We now use Lemma 12 to partition the planar subdivision $G$. Recall that $n$, $m$ and $f$ denote
the numbers of vertices, edges and internal faces of $G$, respectively. However, we cannot use
Lemma 12 directly, as $f$ can be as small as 1 for any $n$. Instead, we divide the faces of the
planar subdivision that have sufficiently many vertices into smaller faces whose sizes are 
bounded by non-constant parameters. It may seem odd not to simply divide the faces into 
triangles, but it is crucial to choose a non-constant parameter for our solution. We have the 
following lemma: 

Lemma 15. Consider a simple polygon $P$ of $n$ vertices. Given an integer $l$ where $l < n$,
there is an $O(n)$-time algorithm that can, by adding edges between the vertices of $P$ that only
intersect at the vertices of $P$, divide the interior of $P$ into a planar subdivision such that
each internal face has at least $l$ vertices (with the exception of at most one internal face) and
at most $3l$ vertices.

Proof. We first triangulate $P$ in $O(n)$ time [14] and denote the resulting graph by $P'$. Note
that we do not add a triangular outer face when triangulating $P$. As each internal face of $P'$
is a triangle, each vertex of the dual graph, $P^*$, of $P'$ (without considering the outer face) has
degree at most 3. The BFS tree, $T$, of $P^*$ is thus a binary tree. It suffices to prove that
we can partition $T$ into subtrees of size at least $l$ (with the exception of at most one subtree)
and at most $3l$ in $O(n)$ time. One way of achieving this is to apply the partition algorithm
of Munro, Raman and Storm [32]. □
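To make the tree-partition step concrete, the following sketch uses a simple bottom-up greedy rule in place of the algorithm of Munro, Raman and Storm [32], which is not reproduced here. The function name `partition_binary_tree` and the dictionary representation of the tree are hypothetical. Each node collects the still-unassigned nodes hanging below it and closes a cluster as soon as at least $l$ nodes have accumulated; since each child passes up fewer than $l$ nodes, a closed cluster has at most $1 + 2(l-1) = 2l - 1 \le 3l$ nodes, and only the leftover at the root can be smaller than $l$.

```python
from typing import Dict, List, Optional, Tuple

# node -> (left child or None, right child or None); hypothetical representation
Tree = Dict[int, Tuple[Optional[int], Optional[int]]]


def partition_binary_tree(tree: Tree, root: int, l: int) -> List[List[int]]:
    """Split a binary tree into connected clusters of size between l and 3l.

    At most one cluster (the leftover containing the root) may be smaller than l.
    Recursive for brevity, so it is only suitable for trees of modest depth, and
    the list concatenations make it O(n * l) rather than strictly linear.
    """
    clusters: List[List[int]] = []

    def visit(v: Optional[int]) -> List[int]:
        # Returns the nodes in v's subtree that are connected to v and not yet
        # assigned to a cluster (always fewer than l of them).
        if v is None:
            return []
        left, right = tree[v]
        pending = [v] + visit(left) + visit(right)
        if len(pending) >= l:
            clusters.append(pending)
            return []
        return pending

    leftover = visit(root)
    if leftover:
        clusters.append(leftover)
    return clusters
```

In the setting of Lemma 15 each tree node is a triangle of $P'$, and merging the triangles of one cluster yields one face of the resulting subdivision.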

With the above lemma, we can now present our partitioning scheme. We choose $l =
\lg^2 n/3$ and use Lemma 15 to divide each internal face of $G$ that has more than $\lg^2 n$ vertices
into smaller faces. We denote the resulting graph by $G'$. Thus any internal face of $G'$ has at
most $\lg^2 n$ vertices. We call a face of $G'$ that is a face of $G$ an original face, and a face of $G'$
that is part of a larger face of $G$ a modified face. By Lemma 15, among the modified faces
of $G'$ that are in the same face of $G$, there is at most one modified face that has fewer than $l$
vertices. Therefore, the total number of modified faces of $G'$ is $O(n/l) = O(n/\lg^2 n)$. Let $f'$
be the number of internal faces of $G'$. Then we have $4(2n - 5)/\lg^2 n < f' < 2n - 5$.

For the top-level partition, we choose $t = \lg^8 f'/f'$ to apply Lemma 12 on $G'$. Then the
$t$-face separator, $S$, of $G'$ has $O(\sqrt{f'/t}) = O(f'/\lg^4 f') = O(n/\lg^4 n)$ faces. As each face of
$G'$ has at most $\lg^2 n$ vertices, the number of vertices in $S$ is at most $O(n/\lg^2 n)$. We call each
adjacent face component of $G' \setminus S$ a region. By Lemma 13, there are at most $O(n/\lg^2 n)$
regions. The number of faces of each region is at most $tf' = \lg^8 f' = O(\lg^8 n)$, so each region
has at most $O(\lg^{10} n)$ vertices. By Lemma 14, the sum of the duplication degrees of the
boundary vertices of all the regions is $O(n/\lg^2 n)$.

Consider a region $R_i$. Let $f_i$ and $n_i$ be the numbers of faces and vertices of $R_i$, respectively.
Then $f_i = O(\lg^8 n)$ and $n_i = O(\lg^{10} n)$. We choose $l = \lg^{1/4} n/3$ to apply Lemma 15 to divide
each internal face of $R_i$ that has more than $\lg^{1/4} n$ vertices into smaller faces. We denote the
resulting graph by $R'_i$. Thus any internal face of $R'_i$ has at most $\lg^{1/4} n$ vertices. We call a
face of $R'_i$ that is a face of $R_i$ an original region face, and a face of $R'_i$ that is part of a larger
face of $R_i$ a modified region face. As in the analysis for $G'$, the number of
modified region faces in $R_i$ is $O(n_i/\lg^{1/4} n)$, so the total number of modified region faces in
all the regions of $G'$ is $O(n/\lg^{1/4} n)$. Let $f'_i$ be the number of internal faces of $R'_i$. We also
have $4(2n_i - 5)/\lg^{1/4} n < f'_i < 2n_i - 5$.

We perform the bottom-level partition on each region $R_i$. Let $t_i = \lg^{3/4} n/f'_i$. We use Lemma 12
to compute a $t_i$-face separator, $S_i$, for $R'_i$. Then $S_i$ has $O(\sqrt{f'_i/t_i}) = O(f'_i/\lg^{3/8} n)$ faces, so $S_i$
has $O(f'_i/\lg^{1/8} n)$ vertices. We call each adjacent face component of $R'_i \setminus S_i$ a subregion of $R'_i$
(or $R_i$), and we denote the $j$th subregion of $R'_i$ by $R_{i,j}$. By Lemma 13, for region $R_i$, there are
at most $O(\lg^{1/4} n \times \sqrt{f'_i/t_i}) = O(f'_i/\lg^{1/8} n)$ subregions. As $f'_i < 2n_i - 5$, the total number
of subregions of all the regions of $G'$ is $O(n/\lg^{1/8} n)$. The number of faces of each subregion is
at most $t_i f'_i = \lg^{3/4} n$, so each subregion has at most $\lg n$ vertices. By Lemma 14, the sum of
the duplication degrees of the boundary vertices of all the subregions of $R_i$ is $O(n_i/\lg^{1/8} n)$,
so the sum of the duplication degrees of all the boundary vertices of the subregions in the
entire graph $G$ is $O(n/\lg^{1/8} n)$.
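The bottom-level bounds follow from the choice $t_i = \lg^{3/4} n/f'_i$ and the fact that every face of $R'_i$ has at most $\lg^{1/4} n$ vertices. The arithmetic, written out as a worked step using only the quantities defined above:

```latex
\begin{align*}
\sqrt{f'_i/t_i} &= \sqrt{\frac{(f'_i)^2}{\lg^{3/4} n}} = \frac{f'_i}{\lg^{3/8} n}
  && \text{(faces of } S_i\text{)} \\
\lg^{1/4} n \cdot \frac{f'_i}{\lg^{3/8} n} &= \frac{f'_i}{\lg^{1/8} n}
  && \text{(vertices of } S_i\text{, and subregions by Lemma 13)} \\
t_i f'_i &= \lg^{3/4} n
  && \text{(faces per subregion)} \\
\lg^{3/4} n \cdot \lg^{1/4} n &= \lg n
  && \text{(vertices per subregion)}
\end{align*}
```

The last line is what allows a subregion to be decoded and scanned by brute force in $O(\lg n)$ time.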

4.4 Vertex Labels and Face Labels 

We now design a labeling scheme for the vertices based on the two-level partition in
Section 4.3. As in the case of planar triangulations, we assign a distinct number called
graph-label from the set $[n]$ to each vertex $x$ of $G$. Each vertex $x$ also has a region-label for each
region $R_i$ it is in, which is a distinct number from the set $[n_i]$. The subregion-label of $x$ is
defined similarly at the subregion level.

We use the techniques in Section 3.3 to assign the labels from bottom up. To assign the
subregion-labels, given a subregion $R_{i,j}$, we use Lemma 11 to permute its vertices. If a vertex
$x$ in $R_{i,j}$ is the $k$th vertex in this permutation, then the subregion-label of $x$ in $R_{i,j}$ is $k$. With
the subregion-labels assigned, we use exactly the same process as in Section 3.3 to compute the
region-labels and graph-labels of all the vertices.

As we use the same technique as in Section 3.3 to label the vertices (except that we use
Lemma 11 instead of Lemma 3), the techniques of Lemma 7 can be used to perform constant-time
conversions from subregion-labels (or region-labels) to region-labels (or graph-labels).
The analysis of the number of regions/subregions and the sums of the duplication degrees
of the boundary vertices of the regions/subregions of $G$ in Section 4.3 guarantees that the
space required is still $o(n)$ bits. Thus we have the following lemma:

Lemma 16. There is a data structure of $o(n)$ bits such that given a vertex $x$ as a subregion-label
$k$ in subregion $R_{i,j}$, the region-label of $x$ in $R_i$ can be computed in $O(1)$ time. Similarly,
there is a data structure of $o(n)$ bits such that given a vertex $x$ as a region-label $k$ in region
$R_i$, the graph-label of $x$ can be computed in $O(1)$ time.

For planar subdivisions, we also need to design a labeling scheme for the faces. We do not
have to do this for planar triangulations, because in that case, each face has three vertices,
and it is sufficient to locate these three vertices to return the face. However, we cannot do
so for general planar subdivisions, because a face may have a large number of vertices, and
it may take too much time to return all these vertices.

For each face of $G$, we assign a distinct number called graph-id from the set $[f]$. This is
the identifier we return when answering point location queries. For each face in a region $R_i$,
we also assign a distinct number called region-id from the set $[f_i]$. Note that a face of the
region $R_i$ is not necessarily a face of $G$ (i.e. it can be a modified face instead of an original
face). For each face in a subregion $R_{i,j}$, we assign a distinct number called subregion-id from
the set $[f_{i,j}]$, where $f_{i,j}$ is the number of faces in $R_{i,j}$. Again a face of $R_{i,j}$ is not necessarily
a face of $R_i$.

We number the faces from bottom up. For each subregion $R_{i,j}$, we list its faces by a
canonical order (such as BFS order) of the corresponding vertices of its dual graph. The $k$th
face listed is assigned $k$ as its subregion-id in $R_{i,j}$. To assign identifiers to a face $x$ in region
$R_i$, there are two cases. First, we consider the case where $x$ is in at least one subregion of
$R_i$, or part of it is a modified region face that is in at least one subregion of $R_i$. Let $q_i$ be the
number of such faces. We assign a distinct number from $[q_i]$ to each such face as its region-id
by computing a permuted sequence of all these faces as follows. We visit each subregion $R_{i,j}$,
for $j = 1, 2, \cdots, u_i$, where $u_i$ is the number of subregions in $R_i$. When we visit $R_{i,j}$, we list all
its faces sorted by their subregion-ids in increasing order. As some faces of $R_{i,j}$ are modified
region faces, we replace these modified region faces by the faces of $R_i$ that they are in. This
way, after we visit all the subregions of $R_i$, we have a sequence of the faces of $R_i$ that are in
this case. Note that each face of $R_i$ may occur multiple times in this sequence, and by only
keeping its first occurrence in the sequence, we have a permuted sequence of such faces. We
assign number $k$ to the $k$th face in this permuted sequence as its region-id. Second, for the
case where $x$ or parts of it only exist in $S_i$, we arbitrarily assign a distinct number from the
set $\{q_i + 1, \cdots, f_i\}$ to each such face as its region-id. The approach to assign graph-ids to
the faces of $G$ is similar, except that we perform the above process for the top-level partition.
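The region-id assignment can be sketched as a first-occurrence numbering over the concatenated face lists. This is an illustration only: the function name `assign_region_ids` and the input layout are hypothetical, and the input is assumed to already give, for each subregion in order, the face of $R_i$ containing each of its faces, listed by increasing subregion-id.

```python
from typing import Dict, Hashable, List, Sequence, Tuple


def assign_region_ids(
    subregion_faces: Sequence[Sequence[Hashable]],
    separator_only_faces: Sequence[Hashable],
) -> Tuple[Dict[Hashable, int], List[Hashable], int]:
    """Assign 1-based region-ids to the faces of one region.

    subregion_faces[j] lists, in subregion-id order, the face of the region that
    contains each face of subregion j (modified region faces already replaced
    by the faces they belong to). Faces seen there are numbered by first
    occurrence; faces that exist only in the separator get the remaining ids.
    Returns the id map, the full sequence before duplicates are dropped, and
    q, the number of faces numbered in the first case.
    """
    region_id: Dict[Hashable, int] = {}
    sequence: List[Hashable] = []
    for faces in subregion_faces:
        for face in faces:
            sequence.append(face)
            if face not in region_id:
                region_id[face] = len(region_id) + 1
    q = len(region_id)
    for face in separator_only_faces:
        region_id[face] = len(region_id) + 1
    return region_id, sequence, q
```

For instance, if the first subregion touches faces A and B and the second touches B, C and A, then A, B, C receive region-ids 1, 2, 3, and a face D that lies only in the separator receives 4.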

Given a face of a subregion (or region) and its subregion-id (or region-id), we need to find
the identifier of the face in the corresponding region (or in $G$) that contains this face. To do
this, we have the following lemma:

Lemma 17. There is a data structure of $o(n)$ bits such that given a face $x$ with subregion-id
$k$ in subregion $R_{i,j}$, the region-id of the face in $R_i$ that contains $x$ can be computed in $O(1)$
time. Similarly, there is a data structure of $o(n)$ bits such that given a face $x$ with region-id
$k$ in region $R_i$, the graph-id of the face of $G$ that contains $x$ can be computed in $O(1)$ time.

Proof. We only show how to prove the first part of this lemma; the second part can be proved
similarly.

We use the same notation as in the previous part of this section. As only faces of $R_i$ that
have more than $\lg^2 n$ vertices are divided into modified region faces, the number of modified
region faces of $R'_i$ is $O(n_i/\lg^2 n)$. Thus $f'_i = f_i + O(n_i/\lg^2 n)$.

Observe that when we compute the region-ids of the faces of $R_i$, we construct a conceptual
array of length $f'_i$ (i.e. the sequence we obtain before we remove the multiple occurrences of
faces in it to obtain the permuted sequence of the faces of $R_i$). We denote this array by $E_i$.
$E_i$ has the answers to our queries, but we cannot afford to store it explicitly. Instead, we
construct the following data structures for each region $R_i$:

- A bit vector $F_i[1..f'_i]$ which stores the numbers $f_{i,1}, f_{i,2}, \cdots, f_{i,u_i}$ in unary, i.e. $F_i =
10^{f_{i,1}-1}10^{f_{i,2}-1} \cdots 10^{f_{i,u_i}-1}$;

- A bit vector $J_i[1..f'_i]$ in which $J_i[k] = 1$ iff the first occurrence of the region-id $E_i[k]$ in $E_i$
is at position $k$ (let $z_i$ denote the number of 0s in $J_i$);

- An array $K_i[1..z_i]$ in which $K_i[k]$ stores the region-id of the face that corresponds to the
$k$th 0 in $J_i$, i.e. $K_i[k] = E_i[\mathrm{select}_{J_i}(0, k)]$.

The analysis of the space costs of these data structures is similar to that in the proof of
Lemma 7. We can prove that the space cost is $o(n)$ bits. The same algorithm as in Lemma 7 can
be used to compute the region-id of $x$ in constant time. □
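One natural reading of how $F_i$, $J_i$ and $K_i$ answer a query is sketched below. It assumes, as in the numbering scheme of this section, that region-ids are assigned in order of first occurrence in $E_i$, so that the $k$th 1 in $J_i$ corresponds to region-id $k$. The function names are hypothetical, and `rank` and `select` are implemented by naive scans here; a succinct structure would support them in constant time using $o(n)$ extra bits.

```python
from typing import List, Sequence, Tuple


def build(E: Sequence[int], sizes: Sequence[int]) -> Tuple[List[int], List[int], List[int]]:
    """Build F, J, K from the conceptual array E and the subregion sizes (each >= 1)."""
    F: List[int] = []
    for s in sizes:
        F += [1] + [0] * (s - 1)  # unary code: a 1 followed by s-1 zeros
    seen = set()
    J: List[int] = []
    K: List[int] = []
    for rid in E:
        if rid in seen:
            J.append(0)
            K.append(rid)  # explicit region-id for a repeated occurrence
        else:
            seen.add(rid)
            J.append(1)
    return F, J, K


def rank(bits: Sequence[int], b: int, pos: int) -> int:
    """Number of occurrences of bit b among the first pos entries."""
    return sum(1 for x in bits[:pos] if x == b)


def select(bits: Sequence[int], b: int, k: int) -> int:
    """1-based position of the k-th occurrence of bit b."""
    count = 0
    for pos, x in enumerate(bits, start=1):
        if x == b:
            count += 1
            if count == k:
                return pos
    raise ValueError("fewer than k occurrences")


def region_id(F: Sequence[int], J: Sequence[int], K: Sequence[int], j: int, s: int) -> int:
    """Region-id of the face with subregion-id s in the j-th subregion (both 1-based)."""
    pos = select(F, 1, j) + s - 1  # position of this face in E
    if J[pos - 1] == 1:
        return rank(J, 1, pos)  # first occurrence: id equals its rank among the 1s
    return K[rank(J, 0, pos) - 1]  # repeated occurrence: look up the stored id
```

Only repeated occurrences need an explicit entry in $K_i$, which is what keeps the structure small when few faces are shared between subregions.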

4.5 Supporting Point Location Queries 

Theorem 2. Given a planar subdivision $G$ of $n$ vertices, there is a succinct geometric index
of $o(n)$ bits that supports point location on $G$ in $O(\lg n)$ time. This index can be constructed
in $O(n)$ time.

Proof. We perform the two-level partitioning of $G$ as in Section 4.3, and use the approaches
in Section 4.4 to assign labels to the vertices and faces of $G$, but we do not store these labels
explicitly. Instead, we sort the vertices by their graph-labels in increasing order, and store
their coordinates as a sequence. We then show how to construct a succinct geometric index
of $o(n)$ bits.

The succinct geometric index consists of three sets of data structures. The first set of data
structures are the $o(n)$-bit data structures constructed in Lemmas 16 and 17 that support
conversions between subregion-labels (or subregion-ids), region-labels (or region-ids) and
graph-labels (or graph-ids). The second and the third sets of data structures correspond to
the top-level and the bottom-level partitions.

To construct the data structures for the top-level partition, we consider the graph $S'$ that
can be constructed by triangulating the graph consisting of the separator $S$ and the outer
face of $G$. $S'$ is a planar triangulation of $O(n/\lg^2 n)$ vertices, so we can use the approach of
Kirkpatrick [29] to construct a data structure $P$ of $O(n/\lg^2 n)$ words (i.e. $O(n/\lg n)$ bits) to
support point location queries on $S'$. Note that when we construct $P$, we simply use the
graph-label of any vertex to retrieve its coordinates, so that we do not store any coordinate
in $P$. For each face of $S'$, we store an integer and a bit. If this face is in region $R_i$, we store $i$
and a bit 1. If it is a face in separator $S$, we store the graph-id of the face of $G$ that contains
this face and a bit 0. As there are $O(n/\lg^2 n)$ faces and the number assigned to each face
can be stored in $\lg n$ bits, the space required to store these numbers and bits is $O(n/\lg n)$
bits. All these ($P$ and the values assigned to the faces of $S'$) are the data structures for the
top-level partition of $G$, and they occupy $O(n/\lg n)$ bits.

The data structures for the bottom-level partition are constructed over the regions of
$G$. Given a region $R_i$, we consider the graph $S'_i$ that can be constructed by triangulating
the graph consisting of the separator $S_i$ and the outer face of $G$. Then $S'_i$ has $O(n_i/\lg^{1/8} n)$
vertices, so a pointer that refers to a vertex of $S'_i$ can be stored in $O(\lg\lg n)$ bits. To refer
to the coordinates of any vertex of $S'_i$, we use only its region-label, as we can compute its
graph-label in constant time by Lemma 16, and $O(\lg\lg n)$ bits are sufficient to store a
region-label. Thus we can use the approach of Kirkpatrick [29] to construct a data structure $P_i$ of
$O(n_i \lg\lg n/\lg^{1/8} n)$ bits to support point location queries on $S'_i$ in $O(\lg\lg n)$ time. We
store a number for each face of $S'_i$: this number is $j$ if the face is in subregion $R_{i,j}$, and if
it is a face in separator $S_i$, we explicitly store the region-id of the face containing it. We also
use a bit to indicate whether a face of $S'_i$ is in a subregion or not. The space cost of storing
these numbers and bits is $O(n_i \lg\lg n/\lg^{1/8} n)$ bits. The space cost of these data structures
for all the regions is $O(n \lg\lg n/\lg^{1/8} n) = o(n)$ bits.

Therefore, the succinct geometric index constructed above occupies o(n) bits. 

To support point location queries, given a query point $x$, we first use the set of data
structures constructed for the top-level partition to perform a point location query on the
graph $S'$ in $O(\lg n)$ time using $x$ as the query point. This tells us whether $x$ is in a face
of $S$ or not. If it is, we also have the graph-id of this face and we return it as the result.
Otherwise, we get the number of the region that $x$ is in. Assume that $x$ is in region $R_i$. We
then use the bottom-level data structures to perform a point location query on the graph $S'_i$
in $O(\lg\lg n)$ time using $x$ as the query point. Similarly, this tells us whether $x$ is in a face of
$S_i$ or not. If it is, we also have the region-id of this face, and we compute its graph-id using
Lemma 17 in constant time and return it as the result. Otherwise, we get the subregion that
$x$ is in. Assume that it is $R_{i,j}$. Each subregion has at most $\lg n$ vertices and, by Lemma 16,
we can compute the graph-label of any of these vertices in constant time and thus retrieve
its coordinates in $O(1)$ time. We can therefore use Lemma 11 to construct the graph $R'_{i,j}$ in
$O(\lg n)$ time, and then check each of its faces to find the face that $x$ is in. The subregion-id of
this face can be determined from the dual graph, and can be used to compute its graph-id
in constant time. Therefore, the entire process takes $O(\lg n)$ time.

As in the analysis of Lemma 9, the preprocessing time is $O(n)$. □
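
The last step of the query, checking each face of a reconstructed subregion, can be illustrated with a small sketch. It assumes the subregion has already been rebuilt as a list of vertex coordinates plus triangles given as index triples; the names `orient`, `contains` and `locate_in_subregion` are hypothetical and not taken from the paper. Since a subregion has at most $\lg n$ points, this linear scan fits within the $O(\lg n)$ query bound.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Triangle = Tuple[int, int, int]  # indices into the subregion's vertex list


def orient(a: Point, b: Point, c: Point) -> float:
    """Twice the signed area of triangle abc (> 0 if counter-clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def contains(a: Point, b: Point, c: Point, x: Point) -> bool:
    """True if x is inside or on the boundary of triangle abc (any orientation)."""
    d1, d2, d3 = orient(a, b, x), orient(b, c, x), orient(c, a, x)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    # x is outside exactly when the three orientation signs disagree.
    return not (has_neg and has_pos)


def locate_in_subregion(vertices: List[Point],
                        faces: List[Triangle],
                        x: Point) -> Optional[int]:
    """Scan every face of the subregion; return the index of one containing x."""
    for face_id, (i, j, k) in enumerate(faces):
        if contains(vertices[i], vertices[j], vertices[k], x):
            return face_id
    return None  # x lies in none of the subregion's faces


# Hypothetical subregion: a square split into two triangles.
vertices = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
faces = [(0, 1, 2), (1, 3, 2)]
```

In the index itself the returned position would then be mapped to a subregion-id and a graph-id; that mapping relies on lemmas outside this section and is not modeled here.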

We can use this theorem to solve the following problem. Given a simple polygon and a 
query point, we want to test whether the polygon contains the query point. This is called 
the membership query on the polygon. 

Corollary 4. Given a simple polygon of $n$ vertices, there is a succinct geometric index of
$o(n)$ bits that supports membership queries on the polygon in $O(\lg n)$ time. This index can be
constructed in $O(n)$ time.

Proof. We simply choose an orthogonal rectangle in the plane that contains this polygon. 
This rectangle and the polygon form a planar subdivision G of two faces. We apply Theorem 2 
to $G$. Given a query point $x$, we can check whether it is inside or outside the rectangle in
constant time. If it is inside the rectangle, we further use the succinct geometric index for $G$
to decide which face it is in. □
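
The two-stage structure of this proof can be sketched as follows. The constant-time rectangle test is as described above; the second stage is only a stand-in: a linear-time crossing-number test is used here in place of the succinct geometric index, which is not reproduced. All function names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def bounding_rectangle(polygon: List[Point]) -> Rect:
    """An orthogonal rectangle that contains the polygon."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    return (min(xs), min(ys), max(xs), max(ys))


def in_rectangle(rect: Rect, x: Point) -> bool:
    """Constant-time test against the enclosing rectangle."""
    xmin, ymin, xmax, ymax = rect
    return xmin <= x[0] <= xmax and ymin <= x[1] <= ymax


def inside_polygon_linear(polygon: List[Point], x: Point) -> bool:
    """Crossing-number test in O(n) time: a stand-in for the index query."""
    inside = False
    n = len(polygon)
    for i in range(n):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through x can cross
        # the ray shot from x to the right.
        if (y1 > x[1]) != (y2 > x[1]):
            x_cross = x1 + (x[1] - y1) * (x2 - x1) / (y2 - y1)
            if x[0] < x_cross:
                inside = not inside
    return inside


def membership(polygon: List[Point], rect: Rect, x: Point) -> bool:
    """Stage 1: rectangle test. Stage 2: decide which of the two faces x is in."""
    if not in_rectangle(rect, x):
        return False
    return inside_polygon_linear(polygon, x)


# Hypothetical L-shaped simple polygon.
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (2.0, 2.0), (2.0, 4.0), (0.0, 4.0)]
rect = bounding_rectangle(polygon)
```

The point $(3, 3)$ is a useful case: it passes the rectangle test but lies in the face between the rectangle and the polygon, so the second stage is what rejects it.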



5 Applications 

5.1 Vertical Ray Shooting Query 

Given a set of disjoint line segments, we can build its trapezoidal decomposition in $O(n \lg n)$
time [18]. Answering a point location query in a trapezoidal decomposition reports the trapezoid
containing the query point. Alternatively, we can return the line segments defining
the upper and the lower edges of the trapezoid. This query is referred to as the vertical ray
shooting query.

Theorem 3. Given a set of disjoint line segments in the plane, there is a succinct geometric
index of $o(n)$ bits that supports vertical ray shooting on this set in $O(\lg n)$ time. This index
can be constructed in $O(n \lg n)$ time.

Proof. We first build the trapezoidal decomposition of the plane with the set of given line
segments, which takes $O(n \lg n)$ time [18]. In this planar subdivision, each face is a trapezoid;
each vertex is determined by at most two line segments.

Similar to building the succinct index for planar triangulations, we apply the two-level
partitioning scheme to the trapezoidal decomposition. In the top-level partition, $O(n/\lg^3 n)$
trapezoids are selected. The line segments defining the selected trapezoids are grouped together.
The rest of the line segments are grouped based on their regions. In the bottom-level partitions,
Oinj log^n) trapezoids are selected in total. Again, line segments are grouped together for 
the separator and each subregion. Unlike in the triangular subdivision, each vertex is
defined by at most two line segments. In the data structure, it is represented by two labels
and a flag. Similar to the extra data structures used in the triangular subdivision, we also
build point location structures for both levels, and this index takes only $o(n)$ bits in total.

Inside subregions, we handle the line segments differently: we simply store all line segments
of the same subregion in an arbitrary order. We can use a labeling scheme similar to that of Section 3.3.
This works because whenever a line segment lies in two different regions (or subregions), it
must contain at least one vertex in the separator of the graph (or of the region). As no two
given segments intersect, we can use this fact to bound the sum, over all line segments, of the
number of regions (or subregions) that a segment can be in.

When processing a query, we use the additional data structures only to locate the
trapezoid containing the query point in the top-level and bottom-level partitions, and to
locate the subregion containing the query point. In the subregion, we scan through all line
segments and determine the line segments directly above and below the query point. Comparing
these with the line segments above and below the query point found at the two levels, we obtain
the final answer. □
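
The scan inside a subregion can be sketched as a brute-force vertical ray shooting over a small list of segments. This is an illustration under assumptions: segments are given as pairs of endpoints, vertical segments are skipped, and a segment passing exactly through the query point is reported as the one above. The names `y_at` and `vertical_ray_shoot` are hypothetical.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def y_at(seg: Segment, x: float) -> float:
    """Height of a non-vertical segment at abscissa x (x within its x-range)."""
    (x1, y1), (x2, y2) = seg
    t = (x - x1) / (x2 - x1)
    return y1 + t * (y2 - y1)


def vertical_ray_shoot(segments: List[Segment], q: Point
                       ) -> Tuple[Optional[Segment], Optional[Segment]]:
    """Return (segment directly above q, segment directly below q)."""
    above: Optional[Tuple[float, Segment]] = None
    below: Optional[Tuple[float, Segment]] = None
    for seg in segments:
        lo = min(seg[0][0], seg[1][0])
        hi = max(seg[0][0], seg[1][0])
        # Skip vertical segments and segments whose x-range misses q.
        if lo == hi or not (lo <= q[0] <= hi):
            continue
        y = y_at(seg, q[0])
        if y >= q[1] and (above is None or y < above[0]):
            above = (y, seg)
        if y < q[1] and (below is None or y > below[0]):
            below = (y, seg)
    return (above[1] if above else None, below[1] if below else None)


# Hypothetical subregion with three stacked segments and one far to the right.
segments = [((0.0, 3.0), (4.0, 3.0)), ((0.0, 1.0), (4.0, 1.0)),
            ((0.0, 5.0), (4.0, 5.0)), ((6.0, 0.0), (8.0, 0.0))]
```

Running the same function on the candidates collected from the top-level partition, the bottom-level partition and the subregion is one way to realize the final comparison step, though the paper does not spell out that step in this form.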
5.2 Implicit Geometric Data Structures

Implicit geometric data structures have been studied since 2000 [7, 8, 9, 12]. For example, 
several implicit 2-d nearest neighbor query structures (this is equivalent to supporting point 
location on Voronoi Diagrams) have been proposed [8, 12]. However, implicit point location 
structures for planar subdivisions are still unknown. We can apply succinct geometric indexes 
to solve this problem. 

We apply the well-known bit encoding technique used in many previous works on in-place
algorithms and implicit data structures (e.g. [31]): divide the input array into consecutive
pairs and permute the two data in each pair. If the first datum of a pair is smaller in lexical
order, a 0 is encoded; otherwise a 1 is encoded. (Assume we have removed all duplicates.) Retrieving one
bit in this encoded data structure requires $O(1)$ time, and retrieving a pointer of size $O(\lg n)$
bits requires $O(\lg n)$ time.
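
A minimal sketch of this pair-permutation encoding follows, assuming the data are distinct and comparable (Python tuples compare lexically, which stands in for the lexical order on coordinates). The function names are hypothetical. Reading one bit is a single comparison, and reading a $w$-bit pointer takes $w$ comparisons, which matches the costs stated above.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int]


def encode_bits(data: List[Point], bits: Sequence[int]) -> None:
    """Permute `data` in place so that pair i (slots 2i, 2i+1) stores bits[i]."""
    assert 2 * len(bits) <= len(data), "each bit needs one pair of data"
    for i, bit in enumerate(bits):
        a, b = data[2 * i], data[2 * i + 1]
        lo, hi = (a, b) if a < b else (b, a)
        # Increasing order encodes 0, decreasing order encodes 1.
        data[2 * i], data[2 * i + 1] = (lo, hi) if bit == 0 else (hi, lo)


def read_bit(data: Sequence[Point], i: int) -> int:
    """Decode bit i with one comparison."""
    return 0 if data[2 * i] < data[2 * i + 1] else 1


def read_pointer(data: Sequence[Point], start: int, width: int) -> int:
    """Decode `width` consecutive bits, most significant first."""
    value = 0
    for k in range(width):
        value = (value << 1) | read_bit(data, start + k)
    return value


# Hypothetical input: eight distinct points, so four bits can be stored.
original = [(3, 1), (0, 2), (5, 5), (4, 0), (1, 1), (2, 7), (9, 9), (8, 3)]
data = list(original)
bits = [1, 0, 1, 1]
encode_bits(data, bits)
```

The array still holds exactly the same data after encoding; only the order within each pair has changed.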

Before encoding the succinct data structure, we put each pair of data in the input array
in lexically increasing order. To ensure that the data structure is still valid under the
above permutation, when applying the separator theorem we need to ensure that all regions
(and subregions) have an even size. If a region has an odd size, we can move one of the
elements in that region into the separator. When computing the sequence to construct the
triangulation of a subregion, we ensure that the number of vertices selected in one round in [19]
is even, and that each time two vertices are removed from the triangulation together.

In the succinct index, we store labels based on the sequence described above. By permuting
each pair of data, we can encode $n/2$ bits in the array. Therefore, we can encode the succinct
data structure of size $o(n)$ bits in the input array, for $n$ large enough. For point location queries,
the input array is a set of vertices, and for vertical ray shooting queries, the input array is
a set of line segments.

When we answer queries, we locate the pair of data in the array, put them in the lexical 
increasing order, then locate the datum based on the label stored in the succinct index. Then 
we have: 

Theorem 4. Given a planar triangulation of $n$ vertices, there is a permutation of the vertex
coordinates array that can support point location in $O(\lg^2 n)$ time with $O(1)$ words of working
space.
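
One way to read the $O(\lg^2 n)$ bound is sketched below. It assumes that the query on the succinct index performs $O(\lg n)$ steps and that each step reads $O(1)$ encoded words of $O(\lg n)$ bits, each of which costs $O(\lg n)$ time to decode as noted earlier. The symbol $T_{\text{implicit}}$ is introduced here for illustration only.

```latex
\[
T_{\text{implicit}}(n)
  = \underbrace{O(\lg n)}_{\text{steps of the succinct index query}}
    \times
    \underbrace{O(\lg n)}_{\text{time to decode one } O(\lg n)\text{-bit word}}
  = O(\lg^{2} n).
\]
```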

Similarly, we can design implicit data structures to answer point location queries in an
arbitrary planar subdivision and to support vertical ray shooting. These data structures are
represented as a permutation of the input array, and answering point location queries in
these data structures takes $O(\lg^2 n)$ time.

6 Conclusion 

In this paper, we start a new line of research by designing succinct geometric indexes. We
design a succinct geometric index for triangular planar subdivisions that occupies $o(n)$ bits and, by
taking advantage of the points permuted and stored elsewhere as a sequence, supports point
location in $O(\lg n)$ time. We also consider the exact number of point-line comparisons,
integer coordinates from a bounded universe, and the case where the query has a certain
distribution, by designing three succinct geometric indexes for them. We also generalize our
techniques to planar subdivisions, and apply them to design succinct geometric indexes for
vertical ray shooting and implicit data structures supporting point location in
$O(\lg^2 n)$ time. In addition, we believe that our techniques are practical. This is because several
previous results we use have practical implementations, such as practical bit vectors [21],
and we can group a constant number of subregions to have enough vertices in order to apply
Lemma 3. Thus we expect our techniques to influence the design of space-efficient geometric
data structures.

There are a few open problems. First, the index we design for the case where the query
distribution is known supports point location in $O(H + 1)$ expected time. It is an
open problem to improve this to $H + o(H)$ expected comparisons. Another open
problem is to design succinct geometric indexes for other types of queries, such as general
ray shooting.